1.
Nat Genet ; 56(1): 162-169, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38036779

ABSTRACT

Fine-mapping aims to identify causal genetic variants for phenotypes. Bayesian fine-mapping algorithms (for example, SuSiE, FINEMAP, ABF and COJO-ABF) are widely used, but assessing posterior probability calibration remains challenging in real data, where model misspecification probably exists, and true causal variants are unknown. We introduce replication failure rate (RFR), a metric to assess fine-mapping consistency by downsampling. SuSiE, FINEMAP and COJO-ABF show high RFR, indicating potential overconfidence in their output. Simulations reveal that nonsparse genetic architecture can lead to miscalibration, while imputation noise, nonuniform distribution of causal variants and quality control filters have minimal impact. Here we present SuSiE-inf and FINEMAP-inf, fine-mapping methods modeling infinitesimal effects alongside fewer larger causal effects. Our methods show improved calibration, RFR and functional enrichment, competitive recall and computational efficiency. Notably, using our methods' posterior effect sizes substantially increases polygenic risk score accuracy over SuSiE and FINEMAP. Our work improves causal variant identification for complex traits, a fundamental goal of human genetics.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Humans , Bayes Theorem , Multifactorial Inheritance , Algorithms
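
The replication failure rate (RFR) mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact definition, so everything specific here is an assumption made for illustration: the function name, the rule that a variant is "confident" when its posterior inclusion probability (PIP) exceeds 0.9 in the downsampled run, and the rule that it "fails to replicate" when its PIP falls below 0.1 in the full-sample run.

```python
import numpy as np

def replication_failure_rate(pip_down, pip_full, high=0.9, low=0.1):
    """Share of variants that are confident in the downsampled run but
    lose support in the full-sample run.

    The thresholds `high` and `low` are illustrative assumptions,
    not values taken from the abstract.
    """
    pip_down = np.asarray(pip_down, dtype=float)
    pip_full = np.asarray(pip_full, dtype=float)
    confident = pip_down > high              # high-confidence calls at reduced sample size
    if not confident.any():
        return float("nan")                  # undefined when nothing is confident
    failed = confident & (pip_full < low)    # calls that vanish at full sample size
    return failed.sum() / confident.sum()

# Toy PIPs for five variants (made-up numbers, for illustration only).
pip_down = [0.95, 0.99, 0.20, 0.92, 0.05]
pip_full = [0.97, 0.03, 0.90, 0.50, 0.01]
rfr = replication_failure_rate(pip_down, pip_full)
```

A well-calibrated method would be expected to give a low value here, since a variant with a PIP near 1 at a smaller sample size should rarely drop to near 0 when more data are added.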
2.
medRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38106023

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp); because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results. We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb, due to alternative splicing) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.

3.
Nat Med ; 29(11): 2785-2792, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37919437

ABSTRACT

Genome-wide association studies (GWASs) have provided numerous associations between human single-nucleotide polymorphisms (SNPs) and health traits. Likewise, metagenome-wide association studies (MWASs) between bacterial SNPs and human traits can suggest mechanistic links, but very few such studies have been done thus far. In this study, we devised an MWAS framework to detect SNPs and associate them with host phenotypes systematically. We recruited and obtained gut metagenomic samples from a cohort of 7,190 healthy individuals and discovered 1,358 statistically significant associations between a bacterial SNP and host body mass index (BMI), from which we distilled 40 independent associations. Most of these associations were unexplained by diet, medications or physical exercise, and 17 replicated in a geographically independent cohort. We uncovered BMI-associated SNPs in 27 bacterial species, and 12 of them showed no association by standard relative abundance analysis. We revealed a BMI association of an SNP in a potentially inflammatory pathway of Bilophila wadsworthia as well as of a group of SNPs in a region coding for energy metabolism functions in a Faecalibacterium prausnitzii genome. Our results demonstrate the importance of considering nucleotide-level diversity in microbiome studies and pave the way toward improved understanding of interpersonal microbiome differences and their potential health implications.


Subjects
Gastrointestinal Microbiome , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Body Mass Index , Single Nucleotide Polymorphism/genetics , Genome-Wide Association Study , Bacteria/genetics
4.
Commun Biol ; 6(1): 277, 2023 03 16.
Article in English | MEDLINE | ID: mdl-36928598

ABSTRACT

Expanding the arsenal of prophylactic approaches against SARS-CoV-2 is of utmost importance, specifically those strategies that are resistant to antigenic drift in Spike. Here, we conducted a screen of over 16,000 RNAi triggers against the SARS-CoV-2 genome, using a massively parallel assay to identify hyper-potent siRNAs. We selected ten candidates for in vitro validation and found five siRNAs that exhibited hyper-potent activity (IC50 < 20 pM) and strong blockade of infectivity in live-virus experiments. We further enhanced this activity by combinatorial pairing of the siRNA candidates and identified cocktails that were active against multiple types of variants of concern (VOC). We then examined over 2,000 possible mutations in the siRNA target sites by using saturation mutagenesis and confirmed broad protection of the leading cocktail against future variants. Finally, we demonstrated that intranasal administration of this siRNA cocktail effectively attenuates clinical signs and viral measures of disease in the gold-standard Syrian hamster model. Our results pave the way for the development of an additional layer of antiviral prophylaxis that is orthogonal to vaccines and monoclonal antibodies.


Subjects
COVID-19 , Small Interfering RNA , SARS-CoV-2 , Animals , Cricetinae , Intranasal Administration , COVID-19/prevention & control , Mesocricetus , RNA Interference , Small Interfering RNA/genetics , Small Interfering RNA/therapeutic use , SARS-CoV-2/genetics
5.
Res Sq ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168385

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp); because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results. We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb, due to alternative splicing) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.

6.
Nat Genet ; 54(6): 827-836, 2022 06.
Article in English | MEDLINE | ID: mdl-35668300

ABSTRACT

Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Genome-Wide Association Study/methods , Phenotype , Single Nucleotide Polymorphism/genetics
7.
Genome Biol ; 23(1): 131, 2022 06 20.
Article in English | MEDLINE | ID: mdl-35725481

ABSTRACT

Genetic studies of human traits have revolutionized our understanding of the variation between individuals, and yet, the genetics of most traits is still poorly understood. In this review, we highlight the major open problems that need to be solved, and by discussing these challenges provide a primer to the field. We cover general issues such as population structure, epistasis and gene-environment interactions, data-related issues such as ancestry diversity and rare genetic variants, and specific challenges related to heritability estimates, genetic association studies, and polygenic risk scores. We emphasize the interconnectedness of these problems and suggest promising avenues to address them.


Subjects
Genome-Wide Association Study , Multifactorial Inheritance , Gene-Environment Interaction , Humans , Phenotype , Single Nucleotide Polymorphism
8.
Nat Genet ; 54(4): 450-458, 2022 04.
Article in English | MEDLINE | ID: mdl-35393596

ABSTRACT

Polygenic risk scores suffer reduced accuracy in non-European populations, exacerbating health disparities. We propose PolyPred, a method that improves cross-population polygenic risk scores by combining two predictors: a new predictor that leverages functionally informed fine-mapping to estimate causal effects (instead of tagging effects), addressing linkage disequilibrium differences, and BOLT-LMM, a published predictor. When a large training sample is available in the non-European target population, we propose PolyPred+, which further incorporates the non-European training data. We applied PolyPred to 49 diseases/traits in four UK Biobank populations using UK Biobank British training data, and observed relative improvements versus BOLT-LMM ranging from +7% in south Asians to +32% in Africans, consistent with simulations. We applied PolyPred+ to 23 diseases/traits in UK Biobank east Asians using both UK Biobank British and Biobank Japan training data, and observed improvements of +24% versus BOLT-LMM and +12% versus PolyPred. Summary statistics-based analogs of PolyPred and PolyPred+ attained similar improvements.


Subjects
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Single Nucleotide Polymorphism/genetics , Risk Factors
9.
Nat Neurosci ; 25(4): 433-445, 2022 04.
Article in English | MEDLINE | ID: mdl-35361972

ABSTRACT

The noncoding genome is substantially larger than the protein-coding genome but has been largely unexplored by genetic association studies. Here, we performed region-based rare variant association analysis of >25,000 variants in untranslated regions of 6,139 amyotrophic lateral sclerosis (ALS) whole genomes and the whole genomes of 70,403 non-ALS controls. We identified interleukin-18 receptor accessory protein (IL18RAP) 3' untranslated region (3'UTR) variants as significantly enriched in non-ALS genomes and associated with a fivefold reduced risk of developing ALS, and this was replicated in an independent cohort. These variants in the IL18RAP 3'UTR reduce mRNA stability and the binding of double-stranded RNA (dsRNA)-binding proteins. Finally, the variants of the IL18RAP 3'UTR confer a survival advantage for motor neurons because they dampen neurotoxicity of human induced pluripotent stem cell (iPSC)-derived microglia bearing an ALS-associated expansion in C9orf72, and this depends on NF-κB signaling. This study reveals genetic variants that protect against ALS by reducing neuroinflammation and emphasizes the importance of noncoding genetic association studies.


Subjects
Amyotrophic Lateral Sclerosis , Induced Pluripotent Stem Cells , Interleukin-18 Receptor beta Subunit/genetics , 3' Untranslated Regions/genetics , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Humans , Induced Pluripotent Stem Cells/metabolism , Interleukin-18 Receptor beta Subunit/metabolism , Motor Neurons/metabolism
10.
bioRxiv ; 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35441162

ABSTRACT

Expanding the arsenal of prophylactic approaches against SARS-CoV-2 is of utmost importance, specifically those strategies that are resistant to antigenic drift in Spike. Here, we conducted a screen with over 16,000 RNAi triggers against the SARS-CoV-2 genome using a massively parallel assay to identify hyper-potent siRNAs. We selected 10 candidates for in vitro validation and found five siRNAs that exhibited hyper-potent activity with IC50<20pM and strong neutralisation in live virus experiments. We further enhanced the activity by combinatorial pairing of the siRNA candidates to develop siRNA cocktails and found that these cocktails are active against multiple types of variants of concern (VOC). We examined over 2,000 possible mutations to the siRNA target sites using saturation mutagenesis and identified broad protection against future variants. Finally, we demonstrated that intranasal administration of the siRNA cocktail effectively attenuates clinical signs and viral measures of disease in the Syrian hamster model. Our results pave the way to development of an additional layer of antiviral prophylaxis that is orthogonal to vaccines and monoclonal antibodies.

11.
PLoS One ; 17(3): e0265756, 2022.
Article in English | MEDLINE | ID: mdl-35324954

ABSTRACT

Numerous human conditions are associated with the microbiome, yet studies are inconsistent as to the magnitude of the associations and the bacteria involved, likely reflecting insufficiently employed sample sizes. Here, we collected diverse phenotypes and gut microbiota from 34,057 individuals from Israel and the U.S. Analyzing these data using a much-expanded microbial genomes set, we derive an atlas of robust and numerous unreported associations between bacteria and physiological human traits, which we show to replicate in cohorts from both continents. Using machine learning models trained on microbiome data, we show prediction accuracy of human traits across two continents. Subsampling our cohort to smaller cohort sizes yielded highly variable models and thus sensitivity to the selected cohort, underscoring the utility of large cohorts and possibly explaining the source of discrepancies across studies. Finally, many of our prediction models saturate at these numbers of individuals, suggesting that similar analyses on larger cohorts may not further improve these predictions.


Subjects
Gastrointestinal Microbiome , Microbiota , Bacteria/genetics , Cohort Studies , Gastrointestinal Microbiome/genetics , Humans , Microbiota/genetics , Phenotype
12.
Elife ; 10, 2021 10 12.
Article in English | MEDLINE | ID: mdl-34635206

ABSTRACT

Polygenic risk scores (PRSs) have been offered since 2019 to screen in vitro fertilization embryos for genetic liability to adult diseases, despite a lack of comprehensive modeling of expected outcomes. Here we predict, based on the liability threshold model, the expected reduction in complex disease risk following polygenic embryo screening for a single disease. A strong determinant of the potential utility of such screening is the selection strategy, a factor that has not been previously studied. When only embryos with a very high PRS are excluded, the achieved risk reduction is minimal. In contrast, selecting the embryo with the lowest PRS can lead to substantial relative risk reductions, given a sufficient number of viable embryos. We systematically examine the impact of several factors on the utility of screening, including: variance explained by the PRS, number of embryos, disease prevalence, parental PRSs, and parental disease status. We consider both relative and absolute risk reductions, as well as population-averaged and per-couple risk reductions, and also examine the risk of pleiotropic effects. Finally, we confirm our theoretical predictions by simulating 'virtual' couples and offspring based on real genomes from schizophrenia and Crohn's disease case-control studies. We discuss the assumptions and limitations of our model, as well as the potential emerging ethical concerns.


Subjects
Crohn Disease/genetics , In Vitro Fertilization , Genetic Testing , Genetic Models , Multifactorial Inheritance , Preimplantation Diagnosis , Schizophrenia/genetics , Computer Simulation , Female , Genetic Predisposition to Disease , Humans , Male , Predictive Value of Tests , Pregnancy , Risk Assessment , Risk Factors
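
The liability threshold reasoning in the abstract above can be sketched with a small Monte Carlo simulation. This is an illustration under stated assumptions, not the authors' model: the function name, the parameter values (PRS variance explained `r2`, prevalence, number of embryos) and the random-mating split of PRS variance into a family-shared half and an embryo-specific half are all choices made here for the example.

```python
import numpy as np
from statistics import NormalDist

def simulate_screening(r2=0.1, prevalence=0.01, n_embryos=5,
                       n_couples=200_000, seed=0):
    """Compare disease risk for a randomly chosen embryo and the lowest-PRS embryo.

    Liability is standard normal; an individual is affected when liability
    exceeds the threshold implied by the prevalence.
    """
    rng = np.random.default_rng(seed)
    threshold = NormalDist().inv_cdf(1.0 - prevalence)

    # Assumed decomposition: half of the PRS variance is shared by all embryos
    # of a couple (mid-parent value), half is specific to each embryo.
    shared = rng.normal(0.0, np.sqrt(r2 / 2), size=(n_couples, 1))
    specific = rng.normal(0.0, np.sqrt(r2 / 2), size=(n_couples, n_embryos))
    prs = shared + specific

    # Everything the PRS does not capture. Drawn independently per embryo for
    # simplicity; sharing it among siblings would not change the marginal risk
    # of a single selected embryo.
    rest = rng.normal(0.0, np.sqrt(1.0 - r2), size=(n_couples, n_embryos))
    affected = (prs + rest) > threshold

    rows = np.arange(n_couples)
    risk_random = affected[:, 0].mean()                      # first embryo stands in for a random pick
    risk_lowest = affected[rows, prs.argmin(axis=1)].mean()  # embryo with the lowest PRS
    return risk_random, risk_lowest

risk_random, risk_lowest = simulate_screening()
relative_risk_reduction = 1.0 - risk_lowest / risk_random
```

Varying `n_embryos` and `r2` in this sketch is one way to see the abstract's point that the benefit of selecting the lowest-PRS embryo depends on the number of viable embryos and on the variance explained by the score.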
13.
Nat Commun ; 11(1): 6258, 2020 12 07.
Article in English | MEDLINE | ID: mdl-33288751

ABSTRACT

Despite considerable progress on pathogenicity scores prioritizing variants for Mendelian disease, little is known about the utility of these scores for common disease. Here, we assess the informativeness of Mendelian disease-derived pathogenicity scores for common disease and improve upon existing scores. We first apply stratified linkage disequilibrium (LD) score regression to evaluate published pathogenicity scores across 41 common diseases and complex traits (average N = 320K). Several of the resulting annotations are informative for common disease, even after conditioning on a broad set of functional annotations. We then improve upon published pathogenicity scores by developing AnnotBoost, a machine learning framework to impute and denoise pathogenicity scores using a broad set of functional annotations. AnnotBoost substantially increases the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores, implying that Mendelian and common disease variants share similar properties. The boosted scores also produce improvements in heritability model fit and in classifying disease-associated, fine-mapped SNPs. Our boosted scores may improve fine-mapping and candidate gene discovery for common disease.


Subjects
Inborn Genetic Diseases/genetics , Genetic Predisposition to Disease/genetics , Linkage Disequilibrium , Missense Mutation , Single Nucleotide Polymorphism , Alleles , Genome-Wide Association Study/methods , Humans , Machine Learning , Mendelian Randomization Analysis/methods
14.
Nature ; 588(7836): 135-140, 2020 12.
Article in English | MEDLINE | ID: mdl-33177712

ABSTRACT

The serum metabolome contains a plethora of biomarkers and causative agents of various diseases, some of which are endogenously produced and some that have been taken up from the environment [1]. The origins of specific compounds are known, including metabolites that are highly heritable [2,3], or those that are influenced by the gut microbiome [4], by lifestyle choices such as smoking [5], or by diet [6]. However, the key determinants of most metabolites are still poorly understood. Here we measured the levels of 1,251 metabolites in serum samples from a unique and deeply phenotyped healthy human cohort of 491 individuals. We applied machine-learning algorithms to predict metabolite levels in held-out individuals on the basis of host genetics, gut microbiome, clinical parameters, diet, lifestyle and anthropometric measurements, and obtained statistically significant predictions for more than 76% of the profiled metabolites. Diet and microbiome had the strongest predictive power, and each explained hundreds of metabolites, in some cases explaining more than 50% of the observed variance. We further validated microbiome-related predictions by showing a high replication rate in two geographically independent cohorts [7,8] that were not available to us when we trained the algorithms. We used feature attribution analysis [9] to reveal specific dietary and bacterial interactions. We further demonstrate that some of these interactions might be causal, as some metabolites that we predicted to be positively associated with bread were found to increase after a randomized clinical trial of bread intervention. Overall, our results reveal potential determinants of more than 800 metabolites, paving the way towards a mechanistic understanding of alterations in metabolites under different conditions and to designing interventions for manipulating the levels of circulating metabolites.


Subjects
Diet , Gastrointestinal Microbiome/physiology , Metabolome/genetics , Serum/metabolism , Adult , Bread , Cohort Studies , Female , Healthy Volunteers , Humans , Life Style , Machine Learning , Male , Metabolomics , Middle Aged , Non-alcoholic Fatty Liver Disease/genetics , Oxygenases/genetics , Reference Standards , Reproducibility of Results , Seasons
15.
Nat Genet ; 52(12): 1355-1363, 2020 12.
Article in English | MEDLINE | ID: mdl-33199916

ABSTRACT

Fine-mapping aims to identify causal variants impacting complex traits. We propose PolyFun, a computationally scalable framework to improve fine-mapping accuracy by leveraging functional annotations across the entire genome (not just genome-wide-significant loci) to specify prior probabilities for fine-mapping methods such as SuSiE or FINEMAP. In simulations, PolyFun + SuSiE and PolyFun + FINEMAP were well calibrated and identified >20% more variants with a posterior causal probability >0.95 than identified in their nonfunctionally informed counterparts. In analyses of 49 UK Biobank traits (average n = 318,000), PolyFun + SuSiE identified 3,025 fine-mapped variant-trait pairs with posterior causal probability >0.95, a >32% improvement versus SuSiE. We used posterior mean per-SNP heritabilities from PolyFun + SuSiE to perform polygenic localization, constructing minimal sets of common SNPs causally explaining 50% of common SNP heritability; these sets ranged in size from 28 (hair color) to 3,400 (height) to 2 million (number of children). In conclusion, PolyFun prioritizes variants for functional follow-up and provides insights into complex trait architectures.


Subjects
Chromosome Mapping/methods , Computational Biology/methods , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Human Genome/genetics , Humans , Phenotype , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics
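
The idea of a minimal SNP set explaining a fixed share of heritability, as described in the abstract above, can be sketched as a greedy ranking. This is a simplified illustration and not PolyFun's code: the function name and the toy per-SNP heritability values are hypothetical, and it assumes the per-SNP heritability estimates are already available and non-negative.

```python
import numpy as np

def minimal_snp_set(per_snp_h2, fraction=0.5):
    """Indices of the smallest set of SNPs whose summed per-SNP heritability
    reaches `fraction` of the total.

    Taking SNPs in decreasing order of contribution gives the smallest such set.
    """
    h2 = np.asarray(per_snp_h2, dtype=float)
    order = np.argsort(h2)[::-1]            # largest contribution first
    cumulative = np.cumsum(h2[order])
    target = fraction * h2.sum()
    # First position where the running total reaches the target, as a count.
    k = int(np.searchsorted(cumulative, target, side="left")) + 1
    return order[:k]

# Toy per-SNP heritabilities (made-up values, for illustration only).
toy_h2 = [0.1, 0.4, 0.2, 0.3]
selected = minimal_snp_set(toy_h2, fraction=0.5)
```

The size of the returned set is a direct summary of how concentrated the signal is: a few SNPs for an oligogenic trait, very many for a highly polygenic one, in line with the range of set sizes the abstract reports.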
16.
Nat Commun ; 10(1): 4054, 2019 09 06.
Article in English | MEDLINE | ID: mdl-31492842

ABSTRACT

Transposable elements (TE) comprise roughly half of the human genome. Though initially derided as junk DNA, they have been widely hypothesized to contribute to the evolution of gene regulation. However, the contribution of TE to the genetic architecture of diseases remains unknown. Here, we analyze data from 41 independent diseases and complex traits to draw three conclusions. First, TE are uniquely informative for disease heritability. Despite overall depletion for heritability (54% of SNPs, 39 ± 2% of heritability), TE explain substantially more heritability than expected based on their depletion for known functional annotations. This implies that TE acquire function in ways that differ from known functional annotations. Second, older TE contribute more to disease heritability, consistent with acquiring biological function. Third, Short Interspersed Nuclear Elements (SINE) are far more enriched for blood traits than for other traits. Our results can help elucidate the biological roles that TE play in the genetic architecture of diseases.


Subjects
DNA Transposable Elements/genetics , Disease/genetics , Gene Expression Regulation , Human Genome/genetics , Inheritance Patterns/genetics , Retroelements/genetics , Algorithms , Autoimmune Diseases/blood , Autoimmune Diseases/genetics , Brain Diseases/blood , Brain Diseases/genetics , Molecular Evolution , Humans , Single Nucleotide Polymorphism , Quantitative Trait Loci/genetics , Short Interspersed Nucleotide Elements/genetics
17.
PLoS Genet ; 15(5): e1008124, 2019 05.
Article in English | MEDLINE | ID: mdl-31071088

ABSTRACT

The rapid digitization of genealogical and medical records enables the assembly of extremely large pedigree records spanning millions of individuals and trillions of pairs of relatives. Such pedigrees provide the opportunity to investigate the sociological and epidemiological history of human populations in scales much larger than previously possible. Linear mixed models (LMMs) are routinely used to analyze extremely large animal and plant pedigrees for the purposes of selective breeding. However, LMMs have not been previously applied to analyze population-scale human family trees. Here, we present Sparse Cholesky factorIzation LMM (Sci-LMM), a modeling framework for studying population-scale family trees that combines techniques from the animal and plant breeding literature and from human genetics literature. The proposed framework can construct a matrix of relationships between trillions of pairs of individuals and fit the corresponding LMM in several hours. We demonstrate the capabilities of Sci-LMM via simulation studies and by estimating the heritability of longevity and of reproductive fitness (quantified via number of children) in a large pedigree spanning millions of individuals and over five centuries of human history. Sci-LMM provides a unified framework for investigating the epidemiological history of human populations via genealogical records.


Subjects
Genealogy and Heraldry , Population Genetics , Longevity/genetics , Genetic Models , Pedigree , Animals , Computer Simulation , Female , Genetic Fitness , Humans , Linear Models , Male , Plants/genetics
18.
Nat Commun ; 9(1): 4919, 2018 11 21.
Article in English | MEDLINE | ID: mdl-30464216

ABSTRACT

Testing for association between a set of genetic markers and a phenotype is a fundamental task in genetic studies. Standard approaches for heritability and set testing strongly rely on parametric models that make specific assumptions regarding phenotypic variability. Here, we show that resulting p-values may be inflated by up to 15 orders of magnitude, in a heritability study of methylation measurements, and in a heritability and expression quantitative trait loci analysis of gene expression profiles. We propose FEATHER, a method for fast permutation-based testing of marker sets and of heritability, which properly controls for false-positive results. FEATHER eliminated 47% of methylation sites found to be heritable by the parametric test, suggesting a substantial inflation of false-positive findings by alternative methods. Our approach can rapidly identify heritable phenotypes out of millions of phenotypes acquired via high-throughput technologies, does not suffer from model misspecification and is highly efficient.


Subjects
Genetic Techniques , Heritable Quantitative Trait , Statistics as Topic , DNA Methylation , Gene Expression , Phenotype
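
Permutation-based testing of a marker set, the general approach named in the abstract above, can be sketched as follows. This is a plain, slow permutation test written for illustration and is not the FEATHER algorithm, whose speed-ups the abstract does not describe; the function name, the choice of test statistic and the simulated data are all assumptions.

```python
import numpy as np

def permutation_pvalue(genotypes, phenotype, n_perm=999, seed=0):
    """Permutation p-value for association between a marker set and a phenotype.

    The null distribution is built by shuffling the phenotype, so no
    parametric assumption about phenotypic variability is needed.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)                   # center each marker
    y = np.asarray(phenotype, dtype=float)
    y = y - y.mean()

    def statistic(v):
        # Sum of squared marker-phenotype cross-products, i.e. v' X X' v.
        return float(np.sum((X.T @ v) ** 2))

    observed = statistic(y)
    exceed = sum(statistic(rng.permutation(y)) >= observed
                 for _ in range(n_perm))
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)

# Simulated data (illustrative only): 200 individuals, 10 markers.
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.3, size=(200, 10))
y_null = rng.normal(size=200)                      # unrelated to the markers
y_assoc = G[:, 0] * 1.0 + rng.normal(size=200)     # driven by the first marker
p_null = permutation_pvalue(G, y_null)
p_assoc = permutation_pvalue(G, y_assoc)
```

The smallest p-value this sketch can return is 1 / (n_perm + 1), which is why naive permutation becomes expensive when very small p-values are needed across millions of phenotypes.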
19.
Nature ; 559(7714): 400-404, 2018 07.
Article in English | MEDLINE | ID: mdl-29988082

ABSTRACT

The incidence of acute myeloid leukaemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 65. Most cases arise without any detectable early symptoms and patients usually present with the acute complications of bone marrow failure [1]. The onset of such de novo AML cases is typically preceded by the accumulation of somatic mutations in preleukaemic haematopoietic stem and progenitor cells (HSPCs) that undergo clonal expansion [2,3]. However, recurrent AML mutations also accumulate in HSPCs during ageing of healthy individuals who do not develop AML, a phenomenon referred to as age-related clonal haematopoiesis (ARCH) [4-8]. Here we use deep sequencing to analyse genes that are recurrently mutated in AML to distinguish between individuals who have a high risk of developing AML and those with benign ARCH. We analysed peripheral blood cells from 95 individuals that were obtained on average 6.3 years before AML diagnosis (pre-AML group), together with 414 unselected age- and gender-matched individuals (control group). Pre-AML cases were distinct from controls and had more mutations per sample, higher variant allele frequencies, indicating greater clonal expansion, and showed enrichment of mutations in specific genes. Genetic parameters were used to derive a model that accurately predicted AML-free survival; this model was validated in an independent cohort of 29 pre-AML cases and 262 controls. Because AML is rare, we also developed an AML predictive model using a large electronic health record database that identified individuals at greater risk. Collectively our findings provide proof-of-concept that it is possible to discriminate ARCH from pre-AML many years before malignant transformation. This could in future enable earlier detection and monitoring, and may help to inform intervention.


Subjects
Genetic Predisposition to Disease , Health , Acute Myeloid Leukemia/genetics , Mutation , Adult , Age Factors , Aged , Disease Progression , Electronic Health Records , Female , Humans , Acute Myeloid Leukemia/epidemiology , Acute Myeloid Leukemia/pathology , Male , Middle Aged , Genetic Models , Mutagenesis , Prevalence , Risk Assessment
20.
Am J Hum Genet ; 103(1): 89-99, 2018 07 05.
Article in English | MEDLINE | ID: mdl-29979983

ABSTRACT

Methods that estimate SNP-based heritability and genetic correlations from genome-wide association studies have proven to be powerful tools for investigating the genetic architecture of common diseases and exposing unexpected relationships between disorders. Many relevant studies employ a case-control design, yet most methods are primarily geared toward analyzing quantitative traits. Here we investigate the validity of three common methods for estimating SNP-based heritability and genetic correlation between diseases. We find that the phenotype-correlation-genotype-correlation (PCGC) approach is the only method that can estimate both quantities accurately in the presence of important non-genetic risk factors, such as age and sex. We extend PCGC to work with arbitrary genetic architectures and with summary statistics that take the case-control sampling into account, and we demonstrate that our new method, PCGC-s, accurately estimates both SNP-based heritability and genetic correlations and can be applied to large datasets without requiring individual-level genotypic or phenotypic information. Finally, we use PCGC-s to estimate the genetic correlation between schizophrenia and bipolar disorder and demonstrate that previous estimates are biased, partially due to incorrect handling of sex as a strong risk factor.


Subjects
Disease/genetics , Single Nucleotide Polymorphism/genetics , Case-Control Studies , Genetic Association Studies/methods , Genome-Wide Association Study/methods , Genotype , Humans , Genetic Models , Phenotype